Goal: Describe variation in MFSR Chinook Salmon spawn timing (i.e., putative redd completion date), and how it relates to environmental covariates.
Objectives:
compile data on spawn timing, stream temperature, stream flow, elevation, and slope for Chinook Salmon in the Middle Fork Salmon River (MFSR) from 2002 to 2005.
compute several summary statistics and data visualizations to examine variation in MFSR Chinook Salmon spawn timing.
fit a series of linear mixed models (LMMs) to test for relationships between spawn timing (day of year) and environmental covariates.
This study was conducted in the Middle Fork of the Salmon River (MFSR) in central Idaho (Fig. 1). The MFSR is a tributary of the Salmon River and is part of the larger Columbia River Basin. The MFSR is home to several species of salmon, including Chinook salmon (Oncorhynchus tshawytscha).
Figure 1: Map of the Middle Fork Salmon River watershed showing major tributaries and surveyed Chinoon Salmon redd locations from 2002 to 2005.
Spawn timing data for Chinook salmon were collected from 2001 to 2005 in the MFSR. We removed data from 2001, and data from Knapp Creek and Cape Horn Creek, as these sites were not consistently sampled. Because redds were not observed daily, we inferred spawn dates as the initial date (day of year) a completed redd was observed.
We spatially joined each redd GPS location to the NHDPlus Version 2 (Horizon Systems, 2018) to assign stream reaches based on a common identifier (COMID). The COMID is used to link redd data with covariate data associated with the stream reach on which it is located. A summary of the dataset is shown in Table 1.
| Name | spawn_data |
| Number of rows | 3016 |
| Number of columns | 6 |
| _______________________ | |
| Column type frequency: | |
| Date | 1 |
| factor | 4 |
| numeric | 1 |
| ________________________ | |
| Group variables | None |
Variable type: Date
| skim_variable | n_missing | complete_rate | min | max | median | n_unique |
|---|---|---|---|---|---|---|
| spawn_date | 0 | 1 | 2002-08-02 | 2005-09-11 | 2003-08-22 | 115 |
Variable type: factor
| skim_variable | n_missing | complete_rate | ordered | n_unique | top_counts |
|---|---|---|---|---|---|
| redd_id | 0 | 1 | FALSE | 3016 | 129: 1, 129: 1, 129: 1, 129: 1 |
| COMID | 0 | 1 | FALSE | 104 | 235: 216, 235: 200, 235: 144, 235: 125 |
| stream | 0 | 1 | FALSE | 8 | Big: 647, Bea: 479, Elk: 471, Mar: 368 |
| year | 0 | 1 | FALSE | 4 | 200: 1217, 200: 1187, 200: 404, 200: 208 |
Variable type: numeric
| skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 |
|---|---|---|---|---|---|---|---|---|---|
| yday | 0 | 1 | 239.7 | 8.36 | 206 | 234 | 241 | 246 | 263 |
Figure 2 provides a visual summary of the distribution of spawn timing (day of year, yday), and variation by stream system and year.
Spawn timing is generally unimodal, with a peak in late August to early September. The distribution is slightly left skewed, but the mean and median are similar, indicating low skewness. The variance is low, suggesting no overdispersion. Suggests Poisson family for response if count of spawning events per day, or Gaussian if modeling day of year as continuous.
There is also substantial variation in spawn date by stream and year (Fig. 2B-C; Fig. 3) and within COMIDs within streams (Fig. 4).
Figure 2: Histogram and density of spawn timing for all streams and years (A), and by stream (B) and year (C). In panel (A), colored, vertical lines indicate the mean spawn date for each year., and the black vertical line indicated the global mean.
Figure 3: Spawning phenology of adult Chinook Salmon. In all panels, the black density function represented stream-level spawn timing, while the colored density functions represent the spawn timing of individual years. The dashed vertical purple lines represent the 5th and 95th percentiles of the basin-wide spawn timing.
Figure 4: Distribution of spawn time variation by COMID and stream.
The temporal progression of Chinook salmon spawning activity varied within each stream across the four study years (Figure 5). The cumulative proportion of redds provides an intuitive measure of the spawning season’s pace and timing. Streams like Marsh and Sulphur exhibit rapid increases, suggesting short, concentrated spawning windows. In contrast, streams like Big and Camas show more gradual increases, indicating a more protracted spawning season. Year-to-year variation is evident in several streams. Overall, this figure highlights both spatial and temporal heterogeneity in spawn timing that underpins the need for models incorporating stream- and year-specific effects.
Figure 5: Proportion of cumulative Chinook salmon redds over time (day of year) across years (2002–2005) and streams. Each line represents a different year, with color denoting the year. Stream-specific panels illustrate temporal variation in the progression of spawning activity, as measured by cumulative redd counts normalized to the maximum value in each stream-year combination.
Redd observations are nested within COMID, which is nested within stream. There are repeated measures on COMID, so random effects are likely needed to account for pseudoreplication (they are correlated).
Table 2 shows the number of COMIDs per stream, and Figure 6 shows the number of observations (unique redds) per COMID.
| stream | n_COMIDs |
|---|---|
| Big | 21 |
| Loon | 19 |
| Camas | 16 |
| Marsh | 13 |
| Bear Valley | 11 |
| Sulphur | 11 |
| Beaver | 7 |
| Elk | 6 |
Figure 6: Number of observations (redds) per COMID.
Many COMIDs only have 1-2 observation, so the groups are not well sampled. There are 23 COMIDs with <5 redds (26%), 13 with <= 2. (12.5%). With <5 obs/level, variance estimates can become unstable and lead to overfitting and absorbing noise (low AIC / high R2) and singular fits. Something to keep in mind during model fitting.
To test for environmental factors driving variation in spawn timing, we quantified associations with metrics describing thermal and physical conditions in stream reaches. We selected and interrogated covariates based on the following criteria: (1) they are known to influence spawn timing, (2) they are available for all streams, and (3) they are not highly correlated with each other.
Our initial, focal independent variables are:
Figure 7: Modeled thermal regimes (2001-2005) for MFSR tributaries.
We calculated metrics relative to a COMID a redd was constructed on and a redd completion date; before, after, and spanning the date. For example, temp_30_before is the average temperature for a COMID where a redd was constructed calculated over the previous 30 days. We did this for 30, 60, and 90 days. We also calculated a time invariant metric relative to a fixed date across all years that was chosen to represent an initial spawning window, e.g., August 1.
The time invariant and after metrics were omitted from further consideration as preliminary data exploration showed weak if any relationship with spawn timing.
Stream flow data were compiled from a single USGS Gage lower in the watershed (MF Salmon River at MF Lodge NR Yellow Pine ID - 13309220; Figure 8).
Figure 8: Inter-annual variability in daily discharge (cfs) at MF Lodge USGS Gage 13309220.
We calculated flow metrics relative to a COMID a redd was constructed on and a redd completion date; before, after, and spanning the date.For example, temp_30_before is the average temperature for a COMID where a redd was constructed calculated over the previous 30 days. We did this for 30, 60, and 90 days.
The spanning and after metrics were omitted from further consideration as preliminary data exploration showed weak if any relationship with spawn timing.
Elevation and stream slope data were available at the COMID (stream reach) scale from the NHD.
We did exploratory data analysis on the compiled dataset to identify candidate covariates for inclusion in model selection.
Bivariate relationships between spawn date (yday) and continuous predictors (Figure 9), and pairwise correlations between all continuous covariates (Figure 10), are shown below.
Figure 9: Bivariate relationships between spawn date (yday) and continuous covariates. Solid lines are linear fits.
Figure 10: Pairwise correlations between continuous covariates.
Takeaways from Figures 9 and 10:
Amoung temperature variables, temp_90 clearly shows the strongest relationship with spawn date (Figure 9), and is highly colinear with temp_30 and temp_60 (Figure 10).
The is a weak negative relationship between mean_elevation and yday (Figure 9), and it is weakly correlated with temp_90 (0.31, Figure 10).
flow_30 = decaying exponential, obvious year effectflow_60 = dittoflow_90 = inflections, year effect clearslope = no relationshipWe will now explore the individual covariates in more detail, focusing on temp_90, mean_elevation, slope, and flow_90.
temp_90Looking more closely at temp_90, Figure 11 shows the relationship between temp_90 and spawn date by stream and year. Clearly a stream effect and year effect, with some wonky stuff going on in 2005.
Figure 11: Bivariate relationship between temp_90 and spawn date by stream and year. Solid lines are linear fits.
mean_elevationFigure 12 shows the relationship between (A) mean_elevation and temp_90 and (B) mean_elevation and yday. The relationship is consistent for Big, Camus, and Loon across years, but is broken down or truncated at high elevation streams (Bear, March, Beaver).
Figure 12: Bivariate relationship between mean_elevation and temp_90 by stream and year. Solid lines are linear fits.
Elevation could be considered a proxy for many site-level gradients (temperature, flow timing, etc.). If we already have direct, mechanistic variables like temp_90, elevation may be redundant or confounding. Although the physical factor of elevation could be the underlying driver of spawn timing, it is not a direct measure of the thermal or hydrological conditions that influence spawn timing. If we do end up tossing elevation later, we can use the following text:
We initially included mean elevation as a covariate to account for broader geographic and thermal gradients. However, elevation was highly collinear with stream and temperature, and its inclusion led to unstable model behavior and uninterpretable parameter estimates. Therefore, we excluded elevation from the final models to reduce confounding and overfitting risk.
slopeslope is not related to yday (Figure 13A), though there is some association between slope and yday when considering stream (Figure 13B). We will likely drop slope.
Figure 13: Bivariate relationship between slope and spawn date (A) and by stream (B). Solid lines are linear fits.
flow_90Because flow data are not COMID- or stream-specific, it makes sense to think about and represent flow as an out-of-basin year effect that determines when adults make it back to the MFSR and initially onto the spawning grounds.
The strong correlation between flow_90 and year can be seen clearly in (Figure 14A).
Figure 14: Spawn timing vs. 90-day mean discharge pre spawn at MF Lodge.
Next we compare the following simple linear models to examine functional structure:
# flow_90 models
m1 <- lm(yday ~ flow_90, data = model_data)
m2 <- lm(yday ~ I(flow_90^2), data = model_data)
m3 <- lm(yday ~ flow_90 + year, data = model_data)
m4 <- lm(yday ~ flow_90 + year + stream, data = model_data)
m5 <- lm(yday ~ flow_90 * stream, data = model_data)
m6 <- lm(yday ~ flow_90 * stream + year, data = model_data)
m7 <- lm(yday ~ flow_90 * year + stream, data = model_data)
m8 <- lm(yday ~ flow_90 * stream + I(flow_90^2), data = model_data)
m9 <- lm(yday ~ flow_90 * stream + year + I(flow_90^2), data = model_data)
AIC scores suggest m7 is best model (Table 3). However, while the R^2 value is 0.98, the parameter estimate for flow_90 has is incredibly small and most variation is now attributed to the year contrasts (Table 4). Thus, flow_90 is clearly confounded with year, and confirmed by high Variance Inflation Factor (VIF) scores (Figure 15).
| Name | Model | R2 | R2_adjusted | RMSE | Sigma | AIC_wt | AICc_wt | BIC_wt | Performance_Score |
|---|---|---|---|---|---|---|---|---|---|
| m7 | lm | 0.9780529 | 0.9779505 | 1.237710 | 1.240800 | 1 | 1 | 1 | 1.0000000 |
| m9 | lm | 0.9407030 | 0.9403269 | 2.034449 | 2.041229 | 0 | 0 | 0 | 0.4960534 |
| m6 | lm | 0.9108990 | 0.9103638 | 2.493858 | 2.501751 | 0 | 0 | 0 | 0.4472891 |
| m4 | lm | 0.8913935 | 0.8909958 | 2.753330 | 2.758824 | 0 | 0 | 0 | 0.4181935 |
| m3 | lm | 0.8745347 | 0.8743681 | 2.959322 | 2.961778 | 0 | 0 | 0 | 0.3942842 |
| m8 | lm | 0.7348812 | 0.7334668 | 4.301804 | 4.313980 | 0 | 0 | 0 | 0.2174139 |
| m5 | lm | 0.7130109 | 0.7115760 | 4.475722 | 4.487641 | 0 | 0 | 0 | 0.1921569 |
| m1 | lm | 0.5900974 | 0.5899614 | 5.348977 | 5.350752 | 0 | 0 | 0 | 0.0576227 |
| m2 | lm | 0.5352432 | 0.5350890 | 5.695650 | 5.697539 | 0 | 0 | 0 | 0.0000000 |
| Parameter | Coefficient | SE | CI | CI_low | CI_high | t | df_error | p |
|---|---|---|---|---|---|---|---|---|
| (Intercept) | 278.2566486 | 0.2130585 | 0.95 | 277.8388932 | 278.6744041 | 1306.0106434 | 3001 | 0.0000000 |
| flow_90 | -0.0219390 | 0.0001136 | 0.95 | -0.0221617 | -0.0217163 | -193.1695196 | 3001 | 0.0000000 |
| year2003 | -8.3607844 | 0.2371172 | 0.95 | -8.8257131 | -7.8958557 | -35.2601350 | 3001 | 0.0000000 |
| year2004 | 11.0240656 | 0.3856186 | 0.95 | 10.2679620 | 11.7801692 | 28.5879998 | 3001 | 0.0000000 |
| year2005 | 1.0123700 | 0.7619753 | 0.95 | -0.4816766 | 2.5064167 | 1.3286127 | 3001 | 0.1840768 |
| streamBeaver | 0.5591186 | 0.1048594 | 0.95 | 0.3535151 | 0.7647220 | 5.3320808 | 3001 | 0.0000001 |
| streamBig | -0.4395848 | 0.0772852 | 0.95 | -0.5911220 | -0.2880475 | -5.6878290 | 3001 | 0.0000000 |
| streamCamas | -0.0934454 | 0.0899470 | 0.95 | -0.2698094 | 0.0829186 | -1.0388944 | 3001 | 0.2989375 |
| streamElk | 0.2219142 | 0.0861007 | 0.95 | 0.0530918 | 0.3907365 | 2.5773800 | 3001 | 0.0100026 |
| streamLoon | 0.0895360 | 0.1073905 | 0.95 | -0.1210304 | 0.3001024 | 0.8337428 | 3001 | 0.4044923 |
| streamMarsh | -0.7143977 | 0.0913025 | 0.95 | -0.8934196 | -0.5353759 | -7.8245141 | 3001 | 0.0000000 |
| streamSulphur | -0.0665471 | 0.1017872 | 0.95 | -0.2661269 | 0.1330327 | -0.6537861 | 3001 | 0.5132997 |
| flow_90:year2003 | 0.0094506 | 0.0001190 | 0.95 | 0.0092172 | 0.0096840 | 79.3900831 | 3001 | 0.0000000 |
| flow_90:year2004 | -0.0101387 | 0.0002436 | 0.95 | -0.0106163 | -0.0096610 | -41.6197601 | 3001 | 0.0000000 |
| flow_90:year2005 | -0.0046105 | 0.0005766 | 0.95 | -0.0057411 | -0.0034799 | -7.9957348 | 3001 | 0.0000000 |
Figure 15: Variance inflation factors (VIF) for model m7, yday ~ flow_90 * year + stream.
flow_90 is problematicNot spatially resolved
flow_90 is calculated from a single downstream gauge, and applied to all reddsCorrelated with year
flow_90 varies mostly across years, it is strongly confounded with yearflow_90 and year risks collinearity, and may produce misleading inferencesSpawn-time aligned flow ≠ experienced flow
flow_90 is aligned to each redd’s spawn date, it still reflects a lower-basin gauge, not the actual hydrologic conditions experienced at the redd siteRecommendation:
Drop flow_90 from model.
Although we initially considered including 90-day mean streamflow (flow_90) as a predictor of spawn timing, this variable was ultimately excluded due to concerns about ecological validity and model overfitting. Stream flow data were derived from a single downstream USGS gauge and did not capture spatial variation across the study streams or reaches. Moreover, because flow_90 was closely aligned with year, it introduced strong collinearity with the year effect and risked attributing site-level variation to flow patterns not actually experienced by individual redds. As such, we excluded flow_90 to avoid misleading inference.
The final dataset includes:
yday: spawn date, continuous response variablecomid, stream, year: grouping variablestemp_90: 90-day mean temperature pre-spawn, continuous predictor variableslope and mean_elevation: provisionally retaining continuous predictor variableWe scaled the continuous covariates to have a mean of 0 and standard deviation of 1. This is important for mixed models, as it helps with convergence and interpretation (Table 5).
| yday | COMID | stream | year | temp_90_raw | mean_elevation_raw | slope_raw | temp_90 | mean_elevation | slope |
|---|---|---|---|---|---|---|---|---|---|
| 235 | 23519365 | Bear Valley | 2002 | 11.07830 | 1951.730 | 0.0029916 | -0.3178564 | 0.7181159 | -0.5189141 |
| 235 | 23519365 | Bear Valley | 2002 | 11.07830 | 1951.730 | 0.0029916 | -0.3178564 | 0.7181159 | -0.5189141 |
| 235 | 23519319 | Bear Valley | 2002 | 12.15472 | 1946.735 | 0.0017181 | 0.4665480 | 0.6954808 | -0.7107062 |
| 235 | 23519319 | Bear Valley | 2002 | 12.15472 | 1946.735 | 0.0017181 | 0.4665480 | 0.6954808 | -0.7107062 |
| 235 | 23519319 | Bear Valley | 2002 | 12.15472 | 1946.735 | 0.0017181 | 0.4665480 | 0.6954808 | -0.7107062 |
The structure of the data is repeated measures on COMIDs, and multiple years, shared across COMIDs within streams.
COMIDstreamyears| Variable | Fixed? | Random? | Why |
|---|---|---|---|
COMID |
No | Yes | Not estimating COMID effects, just accounting for correlation, repeated measures |
Stream |
Yes | Maybe | 8 streams to compare |
Year |
Yes | Maybe | 4 years to compare |
In this case, we certainly will need at least COMID as a random effect to account for the repeated measures. We would like stream as a fixed effect to compare average effects across streams, and year as a fixed effect to compare average effects across years. Though the later maybe be better included as a random effect, as it is not a treatment effect, but rather a random sample of years.
In mixed-effects models, it’s generally recommended to determine the fixed effects structure before deciding on the random effects structure. This approach helps ensure that information intended for fixed effects isn’t unintentionally absorbed by random effects.
We conducted a comprehensive brute-force model selection exercise, fitting 31 additive linear models that combined various combinations of predictors: temp_90, stream, year, mean_elevation, and slope:
# All combinations of 1–5 fixed effects: 31 total
m1 <- lm(yday ~ temp_90, data = df_mod)
m2 <- lm(yday ~ stream, data = df_mod)
m3 <- lm(yday ~ year, data = df_mod)
m4 <- lm(yday ~ mean_elevation, data = df_mod)
m5 <- lm(yday ~ slope, data = df_mod)
m6 <- lm(yday ~ temp_90 + stream, data = df_mod)
m7 <- lm(yday ~ temp_90 + year, data = df_mod)
m8 <- lm(yday ~ temp_90 + mean_elevation, data = df_mod)
m9 <- lm(yday ~ temp_90 + slope, data = df_mod)
m10 <- lm(yday ~ stream + year, data = df_mod)
m11 <- lm(yday ~ stream + mean_elevation, data = df_mod)
m12 <- lm(yday ~ stream + slope, data = df_mod)
m13 <- lm(yday ~ year + mean_elevation, data = df_mod)
m14 <- lm(yday ~ year + slope, data = df_mod)
m15 <- lm(yday ~ mean_elevation + slope, data = df_mod)
m16 <- lm(yday ~ temp_90 + stream + year, data = df_mod)
m17 <- lm(yday ~ temp_90 + stream + mean_elevation, data = df_mod)
m18 <- lm(yday ~ temp_90 + stream + slope, data = df_mod)
m19 <- lm(yday ~ temp_90 + year + mean_elevation, data = df_mod)
m20 <- lm(yday ~ temp_90 + year + slope, data = df_mod)
m21 <- lm(yday ~ temp_90 + mean_elevation + slope, data = df_mod)
m22 <- lm(yday ~ stream + year + mean_elevation, data = df_mod)
m23 <- lm(yday ~ stream + year + slope, data = df_mod)
m24 <- lm(yday ~ stream + mean_elevation + slope, data = df_mod)
m25 <- lm(yday ~ year + mean_elevation + slope, data = df_mod)
m26 <- lm(yday ~ temp_90 + stream + year + mean_elevation, data = df_mod)
m27 <- lm(yday ~ temp_90 + stream + year + slope, data = df_mod)
m28 <- lm(yday ~ temp_90 + stream + mean_elevation + slope, data = df_mod)
m29 <- lm(yday ~ temp_90 + year + mean_elevation + slope, data = df_mod)
m30 <- lm(yday ~ stream + year + mean_elevation + slope, data = df_mod)
m31 <- lm(yday ~ temp_90 + stream + year + mean_elevation + slope, data = df_mod)
Overall, the best-fitting model was m31 (temp_90 + stream + year + mean_elevation + slope), though its improvement in AIC (ΔAIC = 82.3) and R² (0.838) over simpler models was modest (Table 6 and 7).
Models excluding elevation and/or slope (e.g., m16: temp_90 + stream + year) performed nearly as well (R² = 0.816), indicating that these terms contributed little to model explanatory power.
| Model | df | AIC | delta |
|---|---|---|---|
| m31 | 15 | 15907.01 | 0.00000 |
| m26 | 14 | 15989.35 | 82.33843 |
| m27 | 14 | 16231.24 | 324.23284 |
| m16 | 13 | 16285.23 | 378.22148 |
| m28 | 12 | 16982.41 | 1075.40461 |
| Name | Model | R2 | R2_adjusted | RMSE | AIC_wt | BIC_wt | Performance_Score |
|---|---|---|---|---|---|---|---|
| m31 | lm | 0.8378549 | 0.8371527 | 3.364204 | 1 | 1 | 1.0000000 |
| m26 | lm | 0.8332567 | 0.8325904 | 3.411572 | 0 | 0 | 0.5959131 |
| m27 | lm | 0.8193324 | 0.8186104 | 3.551163 | 0 | 0 | 0.5836523 |
| m16 | lm | 0.8159471 | 0.8152732 | 3.584278 | 0 | 0 | 0.5807192 |
| m28 | lm | 0.7679280 | 0.7671557 | 4.024776 | 0 | 0 | 0.5400984 |
Although adding elevation (e.g., m26) improved AIC compared to the base model (m16), the effect estimate for elevation was consistently opposite to ecological expectations—higher elevations predicted later spawn dates (Figure 16 B)) despite the observed negative relationship between elevation and both temperature and spawn timing for several streams (Figure 12). This suggests potential collinearity or confounding, likely due to the inverse relationship between temperature and elevation among streams.
Figure 16: Predicted relationship (red line) between spawn date (yday) and (A) temp_90, (B) mean_elevation, and slope for the top additive linear model (m31).
Finally, slope consistently exhibited minimal impact on model performance, and its effect estimate remained small and sometimes non-significant. Thus, we conclude that temp_90, stream, and year are the most consistent predictors of spawn timing.
Based on model parsimony, interpretability, and consistency with ecological understanding, we selected m16 (temp_90 + stream + year) as the best baseline model to advance into mixed-effects modeling, acknowledging that elevation might be considered cautiously in future hierarchical models if appropriate.
We will now compare the best base additive model (m16, yday ~ temp_90 + stream + year) to a series of models with increasing complexity to evaluate the contribution of each fixed effect and random effect structure.
We will start with a random intercept model to account for the repeated measures on COMIDs. This is a baseline mixed model that will be compared to the additive linear model (m16). We will also compare to model m26 (elevation included as a fixed effect) with COMID random intercepts to see if it improves performance.
m16_re <- lmer(yday ~ temp_90 + stream + year + (1 | COMID), data = df_mod, REML = FALSE)
m26_re <- lmer(yday ~ temp_90 + stream + year + mean_elevation + (1 | COMID), data = df_mod, REML = FALSE)
Adding a random intercept for COMID significantly improved model performance, reducing AIC by >2,700 points compared to the additive fixed-effect model (m16 vs. m16_re; Table 8). This confirms the necessity of accounting for repeated measures and reach-specific baseline differences in spawn timing.
Adding mean elevation as a fixed effect (m26_re) yielded a further AIC improvement of 126 points over m16_re. However, this is substantially smaller than the apparent benefit observed in the fixed-effects-only comparison (296 AIC points; m26 vs. m16), suggesting that much of elevation’s explanatory power is absorbed by the random intercepts.
| Name | Model | RMSE | R2_conditional | R2_marginal | ICC | R2 | R2_adjusted | AIC_wt | BIC_wt | Performance_Score |
|---|---|---|---|---|---|---|---|---|---|---|
| m26_re | lmerMod | 2.047765 | 0.9572783 | 0.7590988 | 0.8226590 | NA | NA | 1 | 1 | 0.9997827 |
| m16_re | lmerMod | 2.046763 | 0.9818897 | 0.7051542 | 0.9385771 | NA | NA | 0 | 0 | 0.3333333 |
| m26 | lm | 3.411572 | NA | NA | NA | 0.8332567 | 0.8325904 | 0 | 0 | 0.0374426 |
| m16 | lm | 3.584278 | NA | NA | NA | 0.8159471 | 0.8152732 | 0 | 0 | 0.0000000 |
Moreover, the parameter estimate for elevation remained positive (Table 9) and counter to biological expectations (i.e., higher elevation reaches spawning later), and the associated prediction plot (Figure 17C) showed a weak and potentially misleading relationship.
| m16 | m26 | m16_re: random intercepts | m26_re: random intercepts + elevation | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 237.80 | 237.42 – 238.18 | <0.001 | 233.73 | 233.16 – 234.31 | <0.001 | 234.23 | 229.39 – 239.06 | <0.001 | 223.56 | 220.51 – 226.62 | <0.001 |
| temp 90 | 7.18 | 7.01 – 7.36 | <0.001 | 9.41 | 9.11 – 9.70 | <0.001 | 15.30 | 15.03 – 15.56 | <0.001 | 15.34 | 15.08 – 15.60 | <0.001 |
| stream [Beaver] | 8.11 | 7.47 – 8.74 | <0.001 | 9.68 | 9.05 – 10.31 | <0.001 | 20.38 | 12.62 – 28.14 | <0.001 | 15.53 | 11.18 – 19.89 | <0.001 |
| stream [Big] | 0.86 | 0.42 – 1.30 | <0.001 | 10.96 | 9.77 – 12.16 | <0.001 | -2.30 | -8.27 – 3.67 | 0.450 | 34.73 | 28.84 – 40.62 | <0.001 |
| stream [Camas] | 2.15 | 1.64 – 2.66 | <0.001 | 8.86 | 7.97 – 9.75 | <0.001 | 3.42 | -2.87 – 9.70 | 0.287 | 25.48 | 20.91 – 30.05 | <0.001 |
| stream [Elk] | -0.25 | -0.74 – 0.23 | 0.306 | 1.20 | 0.71 – 1.69 | <0.001 | 11.02 | 2.89 – 19.14 | 0.008 | 9.10 | 4.58 – 13.61 | <0.001 |
| stream [Loon] | 5.08 | 4.52 – 5.65 | <0.001 | 11.90 | 10.97 – 12.83 | <0.001 | 0.19 | -5.90 – 6.27 | 0.952 | 26.35 | 21.53 – 31.18 | <0.001 |
| stream [Marsh] | 6.17 | 5.68 – 6.66 | <0.001 | 4.97 | 4.49 – 5.45 | <0.001 | 6.89 | 0.31 – 13.47 | 0.040 | 4.66 | 0.97 – 8.34 | 0.013 |
| stream [Sulphur] | 4.95 | 4.32 – 5.58 | <0.001 | 13.17 | 12.07 – 14.26 | <0.001 | 20.86 | 14.00 – 27.71 | <0.001 | 33.47 | 29.25 – 37.69 | <0.001 |
| year [2003] | -2.84 | -3.15 – -2.52 | <0.001 | -3.78 | -4.09 – -3.46 | <0.001 | -5.62 | -5.82 – -5.41 | <0.001 | -5.63 | -5.84 – -5.42 | <0.001 |
| year [2004] | 1.73 | 1.31 – 2.15 | <0.001 | 2.15 | 1.75 – 2.55 | <0.001 | 2.87 | 2.62 – 3.13 | <0.001 | 2.89 | 2.63 – 3.14 | <0.001 |
| year [2005] | 3.91 | 3.38 – 4.45 | <0.001 | 3.94 | 3.42 – 4.45 | <0.001 | 4.28 | 3.95 – 4.60 | <0.001 | 4.28 | 3.95 – 4.60 | <0.001 |
| mean elevation | 4.64 | 4.13 – 5.16 | <0.001 | 14.89 | 12.93 – 16.85 | <0.001 | ||||||
| Random Effects | ||||||||||||
| σ2 | 4.34 | 4.34 | ||||||||||
| τ00 | 66.27 COMID | 20.12 COMID | ||||||||||
| ICC | 0.94 | 0.82 | ||||||||||
| N | 104 COMID | 104 COMID | ||||||||||
| Observations | 3016 | 3016 | 3016 | 3016 | ||||||||
| R2 / R2 adjusted | 0.816 / 0.815 | 0.833 / 0.833 | 0.705 / 0.982 | 0.759 / 0.957 | ||||||||
Figure 17: Predicted relationship (red line) between spawn date (yday) and (A) temp_90, and (B) mean_elevation for the random intercept model (m26_re).
This, combined with collinearity concerns and limited interpretability, led us to exclude mean elevation from further models. Subsequent model refinement focused on fixed effects of temperature, stream, and year, with COMID modeled as a random intercept or slope to capture spatial variation in thermal response.
We next tested whether a nonlinear temperature response improved model fit by adding a quadratic term for temp_90 to the random intercept model (m16_re).
m16_re_quad <- lmer(
yday ~ temp_90 + I(temp_90^2) + stream + year + (1 | COMID),
data = df_mod, REML = FALSE
)
This resulted in a lower AIC (ΔAIC = 91.7; Table 10), indicating substantial support for the quadratic formulation (m16_re_quad). The fitted curve reflects biologically plausible curvature — a decelerating response at higher temperatures — consistent with expectations for thermal constraints on salmonid spawning (Figure 18C).
We therefore retained the quadratic term for temperature in all subsequent models.
| Model | df | AIC | delta |
|---|---|---|---|
| m16_re_quad | 15 | 13473.12 | 0.00000 |
| m16_re | 14 | 13564.49 | 91.36939 |
Figure 18: Predicted relationship (red line) between spawn date (yday) and temp_90 for the variations on m16.
temp_90To account for variation in temperature sensitivity across sites, we extended our top random intercept model (m16_re_quad) by allowing COMID-specific random slopes for temperature (m16_re_quad_rs). This model had the same fixed effects structure but included (1 + temp_90 | COMID).
m16_re_quad <- lmer(
yday ~ temp_90 + I(temp_90^2) + stream + year + (1 | COMID),
data = df_mod, REML = TRUE
)
m16_re_quad_rs <- lmer(
yday ~ temp_90 + I(temp_90^2) + stream + year + (1 + temp_90 | COMID),
data = df_mod, REML = TRUE
)
The addition of random slopes substantially improved model fit (ΔAIC = 510; Table 11), and diagnostic plots confirmed adequate model behavior.
Adding random slopes increased model flexibility, which redistributed explained variance from fixed to random effects. As a result, marginal R² (variance explained by fixed effects) decreased from 0.726 to 0.701, while conditional R² (total variance explained) remained high at 0.98 in both models (Table 12. This tradeoff reflects a more realistic partitioning of variance, acknowledging that some differences in thermal sensitivity are site-specific and better explained by random effects.
| Model | df | AIC | delta |
|---|---|---|---|
| m16_re_quad_rs | 17 | 12948.76 | 0.0000 |
| m16_re_quad | 15 | 13459.05 | 510.2834 |
| m16_re_quad | m16_re_quad_rs | |||||
|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 234.77 | 230.24 – 239.30 | <0.001 | 235.09 | 230.95 – 239.22 | <0.001 |
| temp 90 | 14.74 | 14.46 – 15.02 | <0.001 | 13.90 | 13.06 – 14.74 | <0.001 |
| temp 90^2 | -0.65 | -0.78 – -0.52 | <0.001 | -0.85 | -1.15 – -0.55 | <0.001 |
| stream [Beaver] | 19.48 | 12.22 – 26.75 | <0.001 | 17.41 | 10.62 – 24.19 | <0.001 |
| stream [Big] | -0.93 | -6.53 – 4.67 | 0.745 | -0.19 | -5.37 – 5.00 | 0.944 |
| stream [Camas] | 3.65 | -2.23 – 9.54 | 0.224 | 2.30 | -3.08 – 7.69 | 0.402 |
| stream [Elk] | 10.52 | 2.91 – 18.13 | 0.007 | 11.97 | 4.88 – 19.06 | 0.001 |
| stream [Loon] | 1.07 | -4.63 – 6.78 | 0.713 | 2.80 | -2.47 – 8.06 | 0.298 |
| stream [Marsh] | 6.72 | 0.56 – 12.88 | 0.033 | 7.93 | 2.30 – 13.56 | 0.006 |
| stream [Sulphur] | 20.41 | 13.99 – 26.83 | <0.001 | 14.69 | 8.69 – 20.69 | <0.001 |
| year [2003] | -5.14 | -5.36 – -4.91 | <0.001 | -5.00 | -5.23 – -4.77 | <0.001 |
| year [2004] | 2.76 | 2.51 – 3.02 | <0.001 | 2.77 | 2.53 – 3.01 | <0.001 |
| year [2005] | 4.18 | 3.86 – 4.51 | <0.001 | 3.68 | 3.38 – 3.98 | <0.001 |
| Random Effects | ||||||
| σ2 | 4.24 | 3.36 | ||||
| τ00 | 58.03 COMID | 49.73 COMID | ||||
| τ11 | 12.63 COMID.temp_90 | |||||
| ρ01 | -0.23 COMID | |||||
| ICC | 0.93 | 0.95 | ||||
| N | 104 COMID | 104 COMID | ||||
| Observations | 3016 | 3016 | ||||
| Marginal R2 / Conditional R2 | 0.714 / 0.981 | 0.698 / 0.985 | ||||
Here, we compare best random slope model with and without a quadratic effect. This tests: does random slope replace the need for quadratic, or do both help? We must refit with ML to compare models with different fixed effects.
m16_re_quad_rs <- lmer(
yday ~ temp_90 + I(temp_90^2) + stream + year + (1 + temp_90 | COMID),
data = df_mod, REML = FALSE
)
m16_re_linear_rs <- lmer(
yday ~ temp_90 + stream + year + (1 + temp_90 | COMID),
data = df_mod, REML = FALSE
)
| Model | df | AIC | delta |
|---|---|---|---|
| m16_re_quad_rs | 17 | 12965.97 | 0.00000 |
| m16_re_linear_rs | 16 | 12986.41 | 20.44148 |
Figure 19: Model predictions (red lines) for m16_re_quad_rs and m16_re_linear_rs.
The quadratic term improves fit modestly (ΔAIC = 20.4) while keeping the structure stable. Most of the model’s explanatory power is coming from the random slopes, not the curvature of temperature. But the quadratic does meaningfully refine the relationship — especially at the tails of the temperature gradient (Figure 19) — without overfitting or destabilizing the model. This model is supported. This structure offered a strong balance between flexibility and interpretability and was retained for subsequent analysis.
We now have built a strong foundation showing that COMID-level variation captures spatial structure in the data (m16_re_quad_rs).
So now it does make sense to check interactions, with this rationale:
Even though random slopes already absorb much of the temperature-related heterogeneity, interactions can help us test whether average responses differ systematically across stream or year, beyond what’s explained by the random effects.
Logical interactions to test:
temp_90 * stream – Does the average thermal slope vary among streams?temp_90 * year – Does thermal sensitivity vary across years (e.g., wetter vs. drier)?We will now test interactions and compare them to m16_re_quad_rs.
m201 <- lmer(yday ~ temp_90 * stream + I(temp_90^2) + year + (1 + temp_90 | COMID), data = df_mod, REML = FALSE)
m202 <- lmer(yday ~ temp_90 * year + I(temp_90^2) + stream + (1 + temp_90 | COMID), data = df_mod, REML = FALSE)
Table 14 shows that m201 lowers AIC by 18, while m202 lowers AIC by 1397, a massive improvement.
Table 15 shows that the interaction terms in m201 are mostly non-significant, and the model R² values are nearly identical to m16_re_quad_rs.
| Model | df | AIC | delta |
|---|---|---|---|
| m202 | 20 | 11568.63 | 0.000 |
| m201 | 24 | 12948.51 | 1379.886 |
| m16_re_quad_rs | 17 | 12965.97 | 1397.340 |
| m16_re_quad_rs | m201: temp_90 * stream | m202: temp_90 * year | |||||||
|---|---|---|---|---|---|---|---|---|---|
| Predictors | Estimates | CI | p | Estimates | CI | p | Estimates | CI | p |
| (Intercept) | 235.12 | 231.18 – 239.05 | <0.001 | 234.55 | 230.27 – 238.83 | <0.001 | 232.86 | 226.46 – 239.25 | <0.001 |
| temp 90 | 13.85 | 13.02 – 14.68 | <0.001 | 14.99 | 13.22 – 16.76 | <0.001 | 18.84 | 17.87 – 19.80 | <0.001 |
| temp 90^2 | -0.86 | -1.16 – -0.56 | <0.001 | -1.12 | -1.43 – -0.82 | <0.001 | 1.53 | 1.23 – 1.83 | <0.001 |
| stream [Beaver] | 17.37 | 10.90 – 23.84 | <0.001 | 17.93 | 11.04 – 24.83 | <0.001 | 21.60 | 11.35 – 31.85 | <0.001 |
| stream [Big] | -0.17 | -5.11 – 4.78 | 0.947 | -1.07 | -6.48 – 4.33 | 0.697 | -3.78 | -11.78 – 4.23 | 0.355 |
| stream [Camas] | 2.29 | -2.84 – 7.42 | 0.381 | 2.00 | -3.59 – 7.59 | 0.483 | 1.82 | -6.51 – 10.14 | 0.668 |
| stream [Elk] | 11.88 | 5.12 – 18.64 | 0.001 | 11.74 | 4.53 – 18.95 | 0.001 | 15.89 | 5.13 – 26.64 | 0.004 |
| stream [Loon] | 2.85 | -2.17 – 7.87 | 0.266 | 3.43 | -2.21 – 9.06 | 0.233 | 1.71 | -6.48 – 9.89 | 0.683 |
| stream [Marsh] | 7.91 | 2.55 – 13.27 | 0.004 | 9.15 | 3.35 – 14.96 | 0.002 | 9.50 | 0.82 – 18.18 | 0.032 |
| stream [Sulphur] | 14.69 | 8.97 – 20.42 | <0.001 | 15.37 | 9.26 – 21.49 | <0.001 | 24.50 | 15.44 – 33.56 | <0.001 |
| year [2003] | -4.99 | -5.22 – -4.76 | <0.001 | -4.96 | -5.19 – -4.73 | <0.001 | -7.03 | -7.23 – -6.83 | <0.001 |
| year [2004] | 2.76 | 2.52 – 3.00 | <0.001 | 2.78 | 2.53 – 3.02 | <0.001 | 2.96 | 2.77 – 3.15 | <0.001 |
| year [2005] | 3.68 | 3.38 – 3.98 | <0.001 | 3.69 | 3.39 – 4.00 | <0.001 | 3.75 | 3.51 – 3.98 | <0.001 |
| temp 90 × stream [Beaver] | -2.81 | -5.85 – 0.23 | 0.070 | ||||||
| temp 90 × stream [Big] | 0.83 | -1.40 – 3.06 | 0.466 | ||||||
| temp 90 × stream [Camas] | 0.99 | -1.42 – 3.39 | 0.421 | ||||||
| temp 90 × stream [Elk] | 0.68 | -2.30 – 3.66 | 0.656 | ||||||
| temp 90 × stream [Loon] | -1.29 | -3.87 – 1.28 | 0.325 | ||||||
| temp 90 × stream [Marsh] | -3.96 | -6.48 – -1.44 | 0.002 | ||||||
|
temp 90 × stream [Sulphur] |
-5.05 | -7.75 – -2.35 | <0.001 | ||||||
| temp 90 × year [2003] | -3.89 | -4.08 – -3.70 | <0.001 | ||||||
| temp 90 × year [2004] | 0.63 | 0.40 – 0.87 | <0.001 | ||||||
| temp 90 × year [2005] | -1.29 | -1.65 – -0.93 | <0.001 | ||||||
| Random Effects | |||||||||
| σ2 | 3.36 | 3.37 | 1.96 | ||||||
| τ00 | 45.06 COMID | 50.42 COMID | 114.83 COMID | ||||||
| τ11 | 12.18 COMID.temp_90 | 6.52 COMID.temp_90 | 17.85 COMID.temp_90 | ||||||
| ρ01 | -0.23 COMID | -0.35 COMID | 0.00 COMID | ||||||
| ICC | 0.94 | 0.94 | 0.99 | ||||||
| N | 104 COMID | 104 COMID | 104 COMID | ||||||
| Observations | 3016 | 3016 | 3016 | ||||||
| Marginal R2 / Conditional R2 | 0.713 / 0.984 | 0.736 / 0.985 | 0.620 / 0.994 | ||||||
For m202, while the interactions terms are significant, the Marginal R² value drop by ~10%.
Figure 20 shows that the predicted effects of temperature are implausible, with a positive quadratic effect of temperature. This suggests that the model is overfitting or confounding the interaction structure.
Figure 20: Predicted spawn timing.
To evaluate whether fixed-effect interactions improve model performance beyond what is captured by random slopes, we tested two candidate models: one with a temp_90 × stream interaction (m201) and one with a temp_90 × year interaction (m202), comparing them against our baseline random slope model with a quadratic temperature effect (m16_re_quad_rs).
Model m202 showed a large AIC improvement (ΔAIC = -1397), but inspection of its predicted effects revealed implausible curvature—specifically, a biologically unlikely inverted quadratic effect of temperature. This suggests overfitting or confounding in the interaction structure.
In contrast, m201 showed moderate AIC improvement over the baseline model (ΔAIC = 18), with predictions that remained biologically realistic. However, nearly all interaction terms in m201 were non-significant except one, and model R² values and parameter estimates remained essentially unchanged.
These results indicate that most stream-specific variation in temperature responses is already accounted for by random slopes. Thus, the added complexity of fixed-effect interactions does not yield substantial inferential gains and is not justified. We therefore retained m16_re_quad_rs as our final model.
The final model included a linear and quadratic effect of temperature (temp_90, I(temp_90^2)), additive fixed effects for stream and year, and a random intercept and slope for temperature by COMID:
mod_final <- lmer(yday ~ temp_90 + I(temp_90^2) + stream + year + (1 + temp_90 | COMID),
data = df_mod, REML = TRUE)
This model balances explanatory power, biological realism, and parsimony. It captures both general patterns in spawn timing (via fixed effects) and local deviations in temperature response (via random slopes), and will serve as the basis for diagnostics, prediction, and ecological interpretation.
Parameter estimates (Table 16) indicate a significant nonlinear effect of temperature on spawn timing, with the quadratic term (–0.85) confirming a concave-down relationship—i.e., spawn timing advances with increasing temperature, but levels off at high temperatures.
Most stream and year effects were significant, capturing expected spatiotemporal structure in spawn timing. Notably, Big, Camas, and Loon were not significantly different from the reference level (Bear Valley).
The random effects structure reveals substantial variation across COMIDs. The standard deviation for random intercepts (√τ₀₀ = 7.05) and random slopes for temp_90 (√τ₁₁ = 3.55) indicates strong spatial heterogeneity in both baseline timing and temperature sensitivity (Table 16). The intraclass correlation (ICC) was high (0.95), reflecting strong grouping structure at the COMID level (Table 17).
Model fit was excellent: the marginal R² (variance explained by fixed effects) was 0.698, and the conditional R² (fixed + random effects) was 0.985 (Table 17). This suggests that the majority of explanatory power is derived from spatially varying thermal responses captured by the random slopes, with additional structure provided by the fixed effects.
| Parameter | Coefficient | SE | CI | CI_low | CI_high | t | df_error | p | Effects | Group |
|---|---|---|---|---|---|---|---|---|---|---|
| (Intercept) | 235.09 | 2.11 | 0.95 | 230.95 | 239.22 | 111.50 | 2999 | 0.00 | fixed | |
| temp_90 | 13.90 | 0.43 | 0.95 | 13.06 | 14.74 | 32.50 | 2999 | 0.00 | fixed | |
| I(temp_90^2) | -0.85 | 0.15 | 0.95 | -1.15 | -0.55 | -5.54 | 2999 | 0.00 | fixed | |
| streamBeaver | 17.41 | 3.46 | 0.95 | 10.62 | 24.19 | 5.03 | 2999 | 0.00 | fixed | |
| streamBig | -0.19 | 2.65 | 0.95 | -5.37 | 5.00 | -0.07 | 2999 | 0.94 | fixed | |
| streamCamas | 2.30 | 2.75 | 0.95 | -3.08 | 7.69 | 0.84 | 2999 | 0.40 | fixed | |
| streamElk | 11.97 | 3.61 | 0.95 | 4.88 | 19.06 | 3.31 | 2999 | 0.00 | fixed | |
| streamLoon | 2.80 | 2.68 | 0.95 | -2.47 | 8.06 | 1.04 | 2999 | 0.30 | fixed | |
| streamMarsh | 7.93 | 2.87 | 0.95 | 2.30 | 13.56 | 2.76 | 2999 | 0.01 | fixed | |
| streamSulphur | 14.69 | 3.06 | 0.95 | 8.69 | 20.69 | 4.80 | 2999 | 0.00 | fixed | |
| year2003 | -5.00 | 0.12 | 0.95 | -5.23 | -4.77 | -42.79 | 2999 | 0.00 | fixed | |
| year2004 | 2.77 | 0.12 | 0.95 | 2.53 | 3.01 | 22.57 | 2999 | 0.00 | fixed | |
| year2005 | 3.68 | 0.15 | 0.95 | 3.38 | 3.98 | 24.00 | 2999 | 0.00 | fixed | |
| SD (Intercept) | 7.05 | NA | 0.95 | NA | NA | NA | NA | NA | random | COMID |
| SD (temp_90) | 3.55 | NA | 0.95 | NA | NA | NA | NA | NA | random | COMID |
| Cor (Intercept~temp_90) | -0.23 | NA | 0.95 | NA | NA | NA | NA | NA | random | COMID |
| SD (Observations) | 1.83 | NA | 0.95 | NA | NA | NA | NA | NA | random | Residual |
| AIC | AICc | BIC | R2_conditional | R2_marginal | ICC | RMSE | Sigma |
|---|---|---|---|---|---|---|---|
| 12948.76 | 12948.97 | 13050.96 | 0.9845502 | 0.6979933 | 0.9488428 | 1.780765 | 1.833613 |
Model diagnostics for the final model were generally strong (Figure 21). The posterior predictive check shows excellent agreement between the observed and model-predicted distributions of spawn timing, with overlapping density curves and no major deviations.
Plots of linearity and homoscedasticity indicate acceptable model performance. While the residual vs. fitted plot shows a slight trend, it does not appear strong enough to invalidate model assumptions. The spread of residuals is approximately constant across fitted values, though there is minor funneling at the lower end, likely reflecting skew in early spawn dates.
Influential observations are limited. A few data points exceed standard influence thresholds (|standardized residual| > 2 and high leverage), but none are extreme enough to warrant removal, and their leverage is modest. The model appears robust to outliers.
Collinearity is low: variance inflation factors (VIFs) for all fixed effects are well below the conservative threshold of 5, suggesting little concern about multicollinearity.
The normal Q-Q plot of residuals shows slight right-skew and some heavy tails, but the distribution is reasonably close to normal. Similarly, random effect Q-Q plots for intercepts and slopes show mostly linear trends with slight deviation at the tails, indicating acceptable assumptions for the mixed-effects structure.
Overall, the model shows no violations of key assumptions and is suitable for inference and prediction.
Figure 21: Residual diagnostics for mod_final.
Model predictions closely matched observed spawn timing, with predicted values aligning well along the 1:1 line (Figure 22). This supports strong overall model fit.
Figure 22: Predicted vs. observed spawn timing for Chinook Salmon across all years and streams. The dashed line shows the 1:1 line. Points represent individual redd observations.
yday at each factor levelWe estimated marginal mean of yday at each factor level, averaging over the random effects, to provide an overall estimate of the effect in the population (Figure 23).
Figure 23: Predicted mean spawn dates by stream and year (A), stream (B), and year (C) from the final mixed-effects model (yday ~ temp_90 + I(temp_90^2) + stream + year + (1 + temp_90 | COMID)). Colored points with horizontal lines (A) and black points with black lines (B, C) represent eatimated marginal means and 95% confidence intervals. Background boxplots in panels B and C show the distribution of observed redd counts by group, with individual data points in grey. The modeled predictions represent marginal means accounting for fixed effects and averaged over random effects.
Significant differences in spawn timing were observed between many stream pairs (Table 18), particularly involving Loon (later spawning) and Sulphur (earlier spawning). For example, fish in Loon spawned significantly later than in Bear Valley, Camas, and Elk, while Sulphur exhibited significantly earlier timing than all other streams except Elk. These patterns reflect spatial heterogeneity in temperature and elevation across streams that is not fully captured by fixed effects alone.
| Level1 | Level2 | Difference | SE | CI_low | CI_high | t | df | p |
|---|---|---|---|---|---|---|---|---|
| Beaver | Bear Valley | -1.43 | 3.47 | -8.23 | 5.38 | -0.41 | 2999 | 0.68 |
| Big | Bear Valley | -2.66 | 2.66 | -7.88 | 2.56 | -1.00 | 2999 | 0.32 |
| Camas | Bear Valley | 0.63 | 2.75 | -4.77 | 6.03 | 0.23 | 2999 | 0.82 |
| Elk | Bear Valley | -6.60 | 3.62 | -13.71 | 0.50 | -1.82 | 2999 | 0.07 |
| Loon | Bear Valley | 8.90 | 2.68 | 3.64 | 14.15 | 3.32 | 2999 | 0.00 |
| Marsh | Bear Valley | 6.72 | 2.87 | 1.08 | 12.35 | 2.34 | 2999 | 0.02 |
| Sulphur | Bear Valley | -9.47 | 3.16 | -15.66 | -3.28 | -3.00 | 2999 | 0.00 |
| Big | Beaver | -1.23 | 3.18 | -7.46 | 4.99 | -0.39 | 2999 | 0.70 |
| Camas | Beaver | 2.06 | 3.27 | -4.35 | 8.46 | 0.63 | 2999 | 0.53 |
| Elk | Beaver | -5.18 | 4.01 | -13.04 | 2.69 | -1.29 | 2999 | 0.20 |
| Loon | Beaver | 10.33 | 3.24 | 3.98 | 16.67 | 3.19 | 2999 | 0.00 |
| Marsh | Beaver | 8.14 | 3.41 | 1.46 | 14.82 | 2.39 | 2999 | 0.02 |
| Sulphur | Beaver | -8.04 | 3.54 | -14.98 | -1.10 | -2.27 | 2999 | 0.02 |
| Camas | Big | 3.29 | 2.40 | -1.42 | 8.00 | 1.37 | 2999 | 0.17 |
| Elk | Big | -3.94 | 3.35 | -10.51 | 2.62 | -1.18 | 2999 | 0.24 |
| Loon | Big | 11.56 | 2.35 | 6.95 | 16.16 | 4.92 | 2999 | 0.00 |
| Marsh | Big | 9.38 | 2.57 | 4.33 | 14.42 | 3.64 | 2999 | 0.00 |
| Sulphur | Big | -6.81 | 2.78 | -12.26 | -1.36 | -2.45 | 2999 | 0.01 |
| Elk | Camas | -7.23 | 3.43 | -13.97 | -0.50 | -2.11 | 2999 | 0.04 |
| Loon | Camas | 8.27 | 2.45 | 3.47 | 13.06 | 3.38 | 2999 | 0.00 |
| Marsh | Camas | 6.09 | 2.66 | 0.87 | 11.30 | 2.29 | 2999 | 0.02 |
| Sulphur | Camas | -10.10 | 2.91 | -15.80 | -4.40 | -3.47 | 2999 | 0.00 |
| Loon | Elk | 15.50 | 3.40 | 8.84 | 22.17 | 4.56 | 2999 | 0.00 |
| Marsh | Elk | 13.32 | 3.56 | 6.34 | 20.30 | 3.74 | 2999 | 0.00 |
| Sulphur | Elk | -2.87 | 3.71 | -10.14 | 4.41 | -0.77 | 2999 | 0.44 |
| Marsh | Loon | -2.18 | 2.57 | -7.23 | 2.87 | -0.85 | 2999 | 0.40 |
| Sulphur | Loon | -18.37 | 2.91 | -24.07 | -12.66 | -6.31 | 2999 | 0.00 |
| Sulphur | Marsh | -16.19 | 3.10 | -22.27 | -10.10 | -5.22 | 2999 | 0.00 |
There was a clear trend toward later spawning over the four-year period (Table 18). Spawning in 2005 occurred significantly later than in all previous years. Differences between 2002 and 2003 were not statistically significant, but later years (2004 and especially 2005) were associated with a progressive delay in mean spawn timing. This temporal shift likely reflects interannual variability in temperature and flow conditions.
| Level1 | Level2 | Difference | SE | CI_low | CI_high | t | df | p |
|---|---|---|---|---|---|---|---|---|
| 2003 | 2002 | 0.97 | 0.57 | -0.15 | 2.10 | 1.70 | 2999 | 0.09 |
| 2004 | 2002 | 2.35 | 0.65 | 1.07 | 3.62 | 3.61 | 2999 | 0.00 |
| 2005 | 2002 | 6.18 | 0.61 | 4.98 | 7.37 | 10.17 | 2999 | 0.00 |
| 2004 | 2003 | 1.37 | 0.61 | 0.18 | 2.56 | 2.26 | 2999 | 0.02 |
| 2005 | 2003 | 5.20 | 0.71 | 3.80 | 6.60 | 7.28 | 2999 | 0.00 |
| 2005 | 2004 | 3.83 | 0.87 | 2.11 | 5.54 | 4.38 | 2999 | 0.00 |
To visualize the model’s predictions across the temperature gradient, we estimated the relationship between spawn timing and 90-day pre-spawn mean temperature (temp_90) for different groupings: overall, by stream, and by year. This approach allows us to assess both general trends and context-specific responses.
Stream-specific predictions (see plot suggestion below) show a consistent pattern: spawn timing increases nonlinearly with 90-day mean stream temperature, leveling off at high temperatures. This plateau is consistent with biological expectations, as spawning may be constrained by environmental or physiological thresholds. Stream-to-stream variation in predicted timing reflects both fixed stream effects and COMID-specific random intercepts and slopes.
Year-specific predictions similarly show consistent thermal responses across years, with modest offsets in average spawn timing due to year effects. These predictions further validate the model’s generalizability and temporal consistency.
In addition, a combined stream-by-year plot (e.g., facet grid) confirms that the final model captures heterogeneity in both space and time without overfitting. Lines track the raw data closely across groups, particularly in mid-range temperatures where most observations are concentrated.
Together, these plots indicate that the final model effectively captures both nonlinear temperature effects and spatial variation in thermal sensitivity, while maintaining interpretability and predictive strength.
When stratified by stream and year, predictions captured both the nonlinear temperature response and variation in baseline spawn timing across sites and years (Figure 24). The curvature was consistent across years, while intercept shifts reflected known spatial patterns (e.g., later spawning in warmer, downstream reaches). These results confirm that the model accurately describes both average and context-specific phenological responses to temperature.
Figure 24: Predicted relationship between spawn timing and 90-day pre-spawn mean temperature by stream and year. Lines represent model predictions from the final mixed-effects model. Colored points show observed redd timing, shaded ribbons represent 95% confidence intervals for predictions.
Random intercepts and slopes varied considerably among COMIDs, reflecting spatial heterogeneity in both average spawn timing and thermal sensitivity (Figure 25A). We found considerable spread in intercepts, reflecting differences in average spawn timing between reaches. The random slopes for temp_90 likewise varied meaningfully across COMIDs, indicating that temperature–spawn timing relationships are not constant across space.
Sites with earlier average spawn timing (lower intercepts) generally exhibited stronger positive responses to temperature (higher slopes), while later-spawning sites tended to show weaker temperature effects, a pattern also evident in the weak negative correlation between intercepts and slopes (r = -0.2; Figure 25B). This indicates that reaches with earlier average spawn timing (i.e., negative intercepts) tend to exhibit stronger temperature sensitivity (i.e., steeper positive slopes), whereas later-spawning reaches show weaker responses to temperature. Biologically, this suggests that early-spawning populations are more phenologically plastic and adjust their timing more closely to thermal cues, while late-spawning populations may be constrained by other factors—such as photoperiod, migration fatigue, or compressed spawning windows—resulting in diminished thermal responsiveness. These findings highlight spatial heterogeneity in phenological flexibility, with potential implications for how different populations may respond to climate change.
Figure 25: (A) COMID-specific random parameter estimates for intercepts (left) and slopes (right). Points represent best linear unbiased predictions (BLUPs) from the final model, with horizontal bars indicating ±1.96 standard errors. (B) Correlation between random intercepts and slopes for 90-day temperature across COMIDs. Each point represents a stream reach (COMID).
When grouped by stream, Bear Valley and Big Creek exhibited early average spawn timing and high thermal sensitivity, while Sulphur and Marsh Creeks had later average timing with flatter temperature responses (Figure 26).
Figure 26: Boxplots of random intercepts and slopes for 90-day pre-spawn temperature by stream. Each box represents the distribution of best linear unbiased predictions (BLUPs) for a COMID’s random intercept (average spawn timing) or slope (thermal sensitivity). Bars are colored by stream.
However, when examing individual random effects for each COMID and stream, we observed considerable variation in both intercepts and slopes within streams (Figure 27). For example, Bear Valley and Big Creek had some of the earliest average spawn timings, but also exhibited a wide range of thermal sensitivities. In contrast, Sulphur Creek had later average spawn timing but also showed considerable variability in its response to temperature.
Figure 27: Random intercepts and slopes for 90-day pre-spawn temperature by stream. Each bar represents the best linear unbiased prediction (BLUP) for a COMID’s random intercept (average spawn timing) or slope (thermal sensitivity). Bars are colored by stream.
Variation in temperature–spawn timing relationships across stream reaches (COMIDs) as estimated from the final mixed-effects model is shown in Figure 28). While the overall relationship is positive and nonlinear, individual COMID slopes and intercepts vary considerably. Some reaches show steeper increases in spawn timing with temperature (i.e., stronger thermal sensitivity), while others are relatively flat, indicating a weaker or more buffered response. Grouping by stream shows that some streams (e.g., Bear Valley, Marsh) exhibit tightly clustered trajectories, while others (e.g., Camas, Big) show more divergence. This variation likely reflects fine-scale differences in local hydrology, geomorphology, or biological factors that influence how fish respond to thermal cues within stream systems. The consistency of the overall trend, despite local heterogeneity, supports the biological relevance of temperature in structuring spawn timing.
Figure 28: Predicted spawn timing by 90-day pre-spawn temperature and COMID. Each line represents the predicted spawn timing for a specific COMID, colored by stream. The black line and shaded ribbon represents the 95% confidence interval for the population-level predictions.
TO DISCUSS
Although mean elevation was excluded as a fixed effect due to collinearity and inconsistent directionality, plots of random effects against elevation reveal underlying spatial structure that elevation helps to explain (Figure 29).
In panel A, there is a clear positive relationship between elevation and the random intercepts for some streams (e.g., Big), indicating that higher elevation sites within Big Creek tend to have later average spawn timing compared to the global (population) average.
In panel B, random slopes (i.e., thermal sensitivity) show more idiosyncratic patterns: for some streams (e.g., Camas, Marsh), thermal sensitivity appears to decrease with elevation, suggesting fish at higher elevations may respond less strongly to temperature variability.
However, these relationships are inconsistent across streams, supporting the decision to capture this spatial heterogeneity via COMID-level random effects rather than forcing a global fixed elevation term.
Figure 29: Relationship between mean_elevation and (A) random intercepts and (B) slopes for 90-day pre-spawn temperature, as well as mean_elevation and (C) spawn date (yday) and (D) temp_90 . Points and lines are colored by stream, and solid lines represent linear fits.